automl tool
LightAutoDS-Tab: Multi-AutoML Agentic System for Tabular Data
Lapin, Aleksey, Hromov, Igor, Chumakov, Stanislav, Mitrovic, Mile, Simakov, Dmitry, Nikitin, Nikolay O., Savchenko, Andrey V.
AutoML has advanced in handling complex tasks through the integration of LLMs, yet its efficiency remains limited by dependence on specific underlying tools. In this paper, we introduce LightAutoDS-Tab, a multi-AutoML agentic system for tasks with tabular data, which combines LLM-based code generation with several AutoML tools. Our approach improves the flexibility and robustness of pipeline design, outperforming state-of-the-art open-source solutions on several data science tasks from Kaggle. The code of LightAutoDS-Tab is available in the open repository https://github.com/sb-ai-lab/LADS
ZeroML: A Next Generation AutoML Language
ZeroML is a new-generation programming language for AutoML that drives the ML pipeline in a compiled, multi-paradigm way, with a pure functional core. To address shortcomings of Python, R, and Julia, such as slow run times, brittle pipelines, and high dependency costs, ZeroML adopts a microservices-based architecture built from modular, reusable components such as DataCleaner, FeatureEngineer, and ModelSelector. As a natively multithreaded, memory-aware, search-optimized toolkit with one-command deployment, ZeroML enables non-coders and ML professionals to create high-accuracy models quickly and more reproducibly. The verbosity of the language keeps the code extremely clear when dropping into the backend, while the repetition and boilerplate otherwise required on the front end are removed.
Position: A Call to Action for a Human-Centered AutoML Paradigm
Lindauer, Marius, Karl, Florian, Klier, Anne, Moosbauer, Julia, Tornede, Alexander, Mueller, Andreas, Hutter, Frank, Feurer, Matthias, Bischl, Bernd
Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML) workflows, aiding the research of new ML algorithms, and contributing to the democratization of ML by making it accessible to a broader audience. Over the past decade, commendable achievements in AutoML have primarily focused on optimizing predictive performance. This focused progress, while substantial, raises questions about how well AutoML has met its broader, original goals. In this position paper, we argue that a key to unlocking AutoML's full potential lies in addressing the currently underexplored aspect of user interaction with AutoML systems, including their diverse roles, expectations, and expertise. We envision a more human-centered approach in future AutoML research, promoting the collaborative design of ML systems that tightly integrates the complementary strengths of human expertise and AutoML methodologies.
- Europe > Austria > Vienna (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- (8 more...)
- Research Report (1.00)
- Overview (0.93)
- Government (0.93)
- Health & Medicine (0.93)
- Law (0.67)
- Education (0.67)
Democratize with Care: The need for fairness specific features in user-interface based open source AutoML tools
AI is increasingly playing a pivotal role in businesses and organizations, impacting the outcomes and interests of human users. Automated Machine Learning (AutoML) streamlines the machine learning model development process by automating repetitive tasks and making data-driven decisions, enabling even non-experts to construct high-quality models efficiently. This democratization allows more users (including non-experts) to access and utilize state-of-the-art machine-learning expertise. However, AutoML tools may also propagate bias through the way they handle data, model choices, and optimization approaches. We conducted an experimental study of user-interface-based open source AutoML tools (DataRobot, H2O Studio, Dataiku, and Rapidminer Studio) to examine whether they had features to assist users in developing fairness-aware machine learning models. The experiments covered the following considerations for the evaluation of features: understanding of use case context, data representation, feature relevance and sensitivity, data bias and preprocessing techniques, data handling capabilities, training-testing split, hyperparameter handling and constraints, fairness-oriented model development, explainability, and the ability of users to download and edit models. The results revealed inadequacies in features that could support fairness-aware model development. Further, the results highlight the need to establish certain essential features for promoting fairness in AutoML tools.
- Information Technology (0.69)
- Health & Medicine (0.48)
Assessing the Use of AutoML for Data-Driven Software Engineering
Calefato, Fabio, Quaranta, Luigi, Lanubile, Filippo, Kalinowski, Marcos
Background. Due to the widespread adoption of Artificial Intelligence (AI) and Machine Learning (ML) for building software applications, companies are struggling to recruit employees with a deep understanding of such technologies. In this scenario, AutoML is soaring as a promising solution to fill the AI/ML skills gap since it promises to automate the building of end-to-end AI/ML pipelines that would normally be engineered by specialized team members. Aims. Despite the growing interest and high expectations, there is a dearth of information about the extent to which AutoML is currently adopted by teams developing AI/ML-enabled systems and how it is perceived by practitioners and researchers. Method. To fill these gaps, in this paper, we present a mixed-method study comprising a benchmark of 12 end-to-end AutoML tools on two SE datasets and a user survey with follow-up interviews to further our understanding of AutoML adoption and perception. Results. We found that AutoML solutions can generate models that outperform those trained and optimized by researchers to perform classification tasks in the SE domain. Also, our findings show that the currently available AutoML solutions do not live up to their names as they do not equally support automation across the stages of the ML development workflow and for all the team members. Conclusions. We derive insights to inform the SE research community on how AutoML can facilitate their activities and tool builders on how to design the next generation of AutoML technologies.
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada (0.04)
- Europe > Germany (0.04)
- (6 more...)
- Workflow (1.00)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
The Interplay Between AI and Business Objectives
"The best way to have a good idea is to have a lot of ideas." Let's assume we are running an e-commerce search engine that uses machine learning on user-issued queries to identify the intended product category. Say the model in production incurs a 20ms prediction latency and has 90% accuracy. A natural next goal from a modeling perspective would be to drive the accuracy higher, say to 95% or beyond. However, we know that improving the accuracy almost always requires the consumption of more computational resources for training models and may also increase the inference latency.
Automating the Automators: Shift Change in the Robot Factory – O'Reilly
What would you say is the job of a software developer? A layperson, an entry-level developer, or even someone who hires developers will tell you that job is to … well … write software. An experienced practitioner will tell you something very different. They'd say that the job involves writing some software, sure. Figuring out what kinds of problems are amenable to automation through code. Knowing what to build, and sometimes what not to build because it won't provide value.
WALTS: Walmart AutoML Libraries, Tools and Services
Automated Machine Learning (AutoML) is an emerging field in machine learning (ML) that searches the candidate model space for a given task, dataset, and evaluation metric, and returns the model that performs best on the supplied dataset according to that metric. AutoML not only reduces the manpower and expertise needed to develop ML models but also substantially decreases their time-to-market. We have designed an enterprise-scale AutoML framework called WALTS to meet the rising demand for ML in retail or any other business of interest, and thus help democratize ML within our organization. In this blog, we elaborate on how we explore models from a pool of candidates and underline how it has helped us with a business use-case. To give an overview of the AutoML process, its current landscape, and showcase the benefits of WALTS, we will be covering: · What is AutoML?
- Information Technology (0.49)
- Retail (0.42)
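The WALTS summary above defines AutoML as a search over candidate models for a given dataset and metric. A minimal sketch of that loop follows, in plain Python. It is not the WALTS implementation: the three toy candidates, the tiny one-feature dataset, and every function name here are assumptions made for illustration.

```python
from collections import Counter


def fit_majority(train):
    """Always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def fit_threshold(train):
    """Cut at the midpoint of the two class means (assumes labels 0 and 1 both occur)."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    mean0, mean1 = sum(xs0) / len(xs0), sum(xs1) / len(xs1)
    cut = (mean0 + mean1) / 2

    def predict(x):
        # Predict the class whose mean lies on the same side of the cut as x.
        return int(x > cut) if mean1 > mean0 else int(x < cut)

    return predict


def fit_nearest(train):
    """1-nearest-neighbor on the single numeric feature."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)


def automl_search(candidates, train, valid, metric):
    """Fit every candidate, score it on held-out data, return the best name and all scores."""
    scores = {name: metric(fit(train), valid) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: (feature, label) pairs, invented for this example.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1)]
valid = [(1.5, 0), (2.5, 0), (6.5, 1), (8.0, 1)]
candidates = {"majority": fit_majority, "threshold": fit_threshold, "nearest": fit_nearest}

best, scores = automl_search(candidates, train, valid, accuracy)
print(best, scores)
```

Real systems add hyperparameter search, cross-validation, and time budgets on top of this loop, but the task, dataset, metric, and candidate pool are the same four inputs.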
AutoML: the Promise vs. Reality According to Practitioners
The current conversation about automated machine learning (AutoML) is a blend of hope and frustration. Automation to improve machine learning projects comes from a noble goal. By streamlining development, ML projects can be put in the hands of more people, including those who do not have years of data science training and a data center at their disposal. End-to-end automation, while it may be promised by some providers, is not available yet. There are capabilities in AutoML, particularly in modeling tasks, that practitioners from novice to advanced data scientists are using today to enhance their work.
An Empirical Study on the Usage of Automated Machine Learning Tools
Majidi, Forough, Openja, Moses, Khomh, Foutse, Li, Heng
The popularity of automated machine learning (AutoML) tools in different domains has increased over the past few years. Machine learning (ML) practitioners use AutoML tools to automate and optimize processes such as feature engineering, model training, and hyperparameter optimization. Recent work performed qualitative studies on practitioners' experiences of using AutoML tools and compared different AutoML tools based on their performance and provided features, but no existing work has studied the practices of using AutoML tools in real-world projects at a large scale. Therefore, we conducted an empirical study to understand how ML practitioners use AutoML tools in their projects. To this end, we examined the top 10 most used AutoML tools and their respective usages in a large number of open-source project repositories hosted on GitHub. The results of our study show 1) which AutoML tools are most used by ML practitioners and 2) the characteristics of the repositories that use these AutoML tools. Also, we identified the purposes of using AutoML tools (e.g. model parameter sampling, search space management, model evaluation/error-analysis, data/feature transformation, and data labeling) and the stages of the ML pipeline (e.g. feature engineering) where AutoML tools are used. Finally, we report how often AutoML tools are used together in the same source code files. We hope our results can help ML practitioners learn about different AutoML tools and their usages, so that they can pick the right tool for their purposes. In addition, AutoML tool developers can benefit from our findings to gain insight into the usages of their tools and improve them to better fit users' needs.
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > France (0.04)
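The last study above reports how often AutoML tools are used together in the same source code files. One way such co-usage could be counted is sketched below by scanning Python import statements. This is not the study's method: the tool list is an illustrative assumption (the abstract does not name its top 10), and the file contents are invented.

```python
import re
from collections import Counter
from itertools import combinations

# Illustrative set of AutoML-related packages; NOT the study's top-10 list.
TOOLS = {"optuna", "hyperopt", "tpot", "autokeras"}

# Capture the top-level package name of `import x` / `from x import y` lines.
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_]\w*)", re.MULTILINE)


def tools_in_source(source):
    """Return the set of known tools imported by one source file."""
    return {name for name in IMPORT_RE.findall(source) if name in TOOLS}


def co_usage(files):
    """Count, over all files, each unordered pair of tools imported together."""
    pairs = Counter()
    for source in files.values():
        used = sorted(tools_in_source(source))
        pairs.update(combinations(used, 2))
    return pairs


# Invented file contents for demonstration.
files = {
    "a.py": "import optuna\nfrom hyperopt import fmin\n",
    "b.py": "import optuna\nimport numpy as np\n",
    "c.py": "import hyperopt\nimport optuna\nimport tpot\n",
}

for pair, count in co_usage(files).most_common():
    print(pair, count)
```

A regular expression over import lines misses dynamic imports and non-Python code, so a real analysis would likely need language-aware parsing.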